ASSETS-8997 add serviceoverload error reason #71

adamcin · 2022-05-24T19:31:18Z

jira: https://jira.corp.adobe.com/browse/ASSETS-8997
downstream dependent @adobe/asset-compute-sdk PR#182

This change adds a new rendition_failed reason (ServiceOverLoad) for use by asset compute workers that encounter upstream API rate limiting and need to indicate to downstream clients that a resubmission of the original asset compute request is necessary after some time has passed.

Also defined is a ServiceOverLoadErrorType, which extends ClientError rather than GenericError, because it is defined in the spirit of HTTP 4xx (429, specifically).
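For readers without the full diff, here is a minimal sketch of how the new class might sit in the error hierarchy. The `Reason` map and the `GenericError` and `ClientError` stand-ins, including their constructor signatures, are assumptions made for illustration; the real definitions live in lib/errors.js and may differ.

```javascript
// Hypothetical stand-ins for the base types in lib/errors.js (assumed shapes).
const Reason = Object.freeze({
    GenericError: "GenericError",
    ServiceOverLoad: "ServiceOverLoad"
});

// Assumed: server-side failure the client cannot fix (HTTP 5xx in spirit).
class GenericError extends Error {
    constructor(message, name = "GenericError", reason = Reason.GenericError) {
        super(message);
        this.name = name;
        this.reason = reason;
    }
}

// Assumed: failure the client can react to (HTTP 4xx in spirit).
class ClientError extends Error {
    constructor(message, name, reason) {
        super(message);
        this.name = name;
        this.reason = reason;
    }
}

// Worker encountered upstream API rate limiting. Client may resubmit request after some time.
class ServiceOverLoadError extends ClientError {
    constructor(message) {
        super(message, "ServiceOverLoadError", Reason.ServiceOverLoad);
    }
}

// Usage: a worker that was throttled by an upstream API.
const err = new ServiceOverLoadError("upstream service returned 429");
console.log(err instanceof ClientError, err.reason);
```

Extending `ClientError` means code that already branches on client errors would pick up the new type without further changes, under the assumptions above.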

tmathern · 2022-05-26T18:02:59Z

lib/errors.js

@@ -121,6 +122,14 @@ class RenditionTooLarge extends ClientError {
    }
 }

+// Worker encountered upstream API rate limiting. Client may resubmit request after some time.
+class ServiceOverLoadError extends ClientError {


We have a similar error in api-process already (look for TooManyRequestsError). Could we merge those two classes into one instead, and refactor a bit where possible?

I based the ServiceOverLoad reason name on the design documents attached to ASSETS-8997, though there are places where the error is listed as "TooManyRequests/ServiceOverLoad", and it wasn't clear whether there was ambiguity over which error name to use throughout, or whether it was intentional to have both. I personally see value in supporting both errors. TooManyRequests is more readily associated with an HTTP 429 response originating from the Asset Compute Service itself, with the ability to provide a retry-after directive for the client. ServiceOverLoad would represent a more general error type that Asset Compute can throw asynchronously when it encounters throttling from upstream/3rd-party services (such as when a worker receives a 429 Too Many Requests HTTP response).

If the AEM client receives either error, the proper behavior is to retry the original request after some time has passed, but with TooManyRequests, the client may be given an explicit Retry-After, whereas with ServiceOverLoad it's basically

Retry-After: 🤷

I kind of had a hybrid approach in mind where we could support both of these error types in AEM for rendition_failed events just in case, and define both types in asset-compute-commons, along with making the semantic distinction more clearly defined along the lines I described above. Would that work?
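To make the distinction concrete, here is a rough sketch of how a client could choose a retry delay for each reason. Everything in it is hypothetical: the event shape (`reason`, `retryAfterSeconds`), the function name, and the backoff constants are illustrative assumptions, not the AEM or asset-compute-commons API.

```javascript
// All names and constants below are hypothetical, chosen only for illustration.
const BASE_DELAY_MS = 30 * 1000;      // first fallback delay
const MAX_DELAY_MS = 15 * 60 * 1000;  // cap for the fallback delay

// event:   { reason: string, retryAfterSeconds?: number }
// attempt: number of retries already made (0 for the first retry)
// Returns a delay in milliseconds, or null when the reason is not retryable here.
function retryDelayMs(event, attempt) {
    if (event.reason === "TooManyRequests" && Number.isFinite(event.retryAfterSeconds)) {
        // The service supplied an explicit Retry-After value: honor it.
        return Math.max(0, event.retryAfterSeconds) * 1000;
    }
    if (event.reason === "TooManyRequests" || event.reason === "ServiceOverLoad") {
        // No directive available: fall back to capped exponential backoff.
        return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
    }
    return null;
}

console.log(retryDelayMs({ reason: "TooManyRequests", retryAfterSeconds: 120 }, 0));
console.log(retryDelayMs({ reason: "ServiceOverLoad" }, 2));
```

Both reasons lead to a retry; the only difference in this sketch is whether the delay comes from the service or from the client's own backoff policy.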

It would work, but I'm not sure whether having two different error names initially was intentional or not. @pheenomenon can probably clarify whether the two different errors were intended or are just "synonyms" (talking about the current design, not what we'll have eventually).

The use of "TooManyRequests/ServiceOverLoad" was not meant to imply that the two are the same. It was only used to express the idea.

I have seen that our downstream services can get overloaded for a variety of reasons and return 500 instead of 429. So I like the idea of keeping it flexible as ServiceOverload instead of TooManyRequestsError.

As for whether TooManyRequestsError (which we use in api-process for Nui throttling) should be converged into ServiceOverload: we can take that route if we want, but it wouldn't have an API dependency with AEM and wouldn't bring a huge advantage. So the hybrid approach sounds good to me too.

In that case (which is also what confused me): Although 500 is generic, 503 should be ServiceOverload then (503 usually means server is busy - but our services don't use it yet as far as I know). Otherwise it could be confusing for developers using our APIs: Why do they get an Overload error when there is a 500 (which could be anything, since it's generic)?

(Could we maybe still move the TooManyRequests exception here too, while at it, @adamcin? If it doesn't throw you off-track?)

tmathern

Approving, with the note that ServiceOverload should be reserved for HTTP code 503, and not generic.
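One reading of the mapping discussed in this review, sketched as a helper a worker might use when an upstream call fails. The function name and the status set are assumptions: 503 follows the note above, 429 follows the PR description, and a generic 500 deliberately does not map to ServiceOverLoad.

```javascript
// Hypothetical helper; the name and the status set are assumptions for illustration.
// 429: upstream rate limiting (the case described in the PR description).
// 503: upstream service busy (the case suggested in the review).
const OVERLOAD_STATUSES = new Set([429, 503]);

// Maps an upstream HTTP status seen by a worker to a rendition_failed reason.
function reasonForUpstreamStatus(status) {
    if (OVERLOAD_STATUSES.has(status)) {
        return "ServiceOverLoad";
    }
    // A plain 500 could be anything, so it stays generic.
    return "GenericError";
}

console.log(reasonForUpstreamStatus(503), reasonForUpstreamStatus(500));
```

Keeping the status set in one place makes it easy to tighten the mapping to 503 only if that stricter interpretation is preferred.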

jdelbick

LGTM, please additionally update the readme with the new error type and description:
https://github.com/adobe/asset-compute-commons#custom-errors

adamcin added 3 commits May 24, 2022 12:21

ASSETS-8997 add serviceoverload error reason

b6787c3

ASSETS-8997 add unit test for serviceoverload

a8010da

ASSETS-8997 corrected class description

42ac6ae

adamcin mentioned this pull request May 25, 2022

ASSETS-8997 add support for serviceoverload rendition_failed reason adobe/asset-compute-sdk#182

Open

adamcin requested review from jdelbick, pheenomenon and tmathern May 26, 2022 16:44

tmathern requested changes May 26, 2022

pheenomenon approved these changes May 27, 2022

tmathern self-requested a review May 31, 2022 15:45

tmathern approved these changes May 31, 2022

jdelbick approved these changes May 31, 2022

adamcin commented May 24, 2022

tmathern May 26, 2022

adamcin May 27, 2022

tmathern May 27, 2022

pheenomenon May 27, 2022

tmathern May 31, 2022

tmathern May 31, 2022

tmathern left a comment

jdelbick left a comment
